ABHIDHA: An extended WordNet for Indo-Aryan Languages
Abstract
A lexical knowledge base is an important component of any intelligent information processing system. The WordNet developed at the Cognitive Science Laboratory at Princeton has served as a lexical reference system for natural language processing activities. The Indian-language activities at our institute, mainly in text-to-speech synthesis and natural language generation from iconic inputs, require the inclusion of additional features in the lexical reference system, such as phonology, word roots, and etymological information. Our initial efforts have been in Hindi and Bengali, but the commonality of Indo-Aryan languages and the importance of these extra features lead us to believe that it is worthwhile to build a WordNet containing these features for other Indo-Aryan languages. In this paper we discuss the issues relating to the structured design and development of a generalized extended WordNet for Indo-Aryan languages, with special reference to Hindi and Bengali.
Similar resources
Part II Implementation of Indo-Aryan LexicalNet: An Extended WordNet for Hindi and Bengali
Introduction to Gujarati wordnet
Gujarati is one of the 22 official languages of India. It is an Indo-Aryan language descended from Sanskrit. The Gujarati wordnet is being built using the expansion approach, with Hindi as the source language. This paper describes experiences of building the Gujarati wordnet. It discusses basic features of the Gujarati language and evaluates the suitability of Hindi for the expansion approach. Various issue...
Building a WordNet for Sinhala
Sinhala is one of the official languages of Sri Lanka and is used by over 19 million people. It belongs to the Indo-Aryan branch of the Indo-European languages, and its origins date back at least 2000 years. It has developed into its current form over a long period of time, with influences from a wide variety of languages including Tamil, Portuguese and English. As for any other language, a Wo...
Dialects in the Indo-Aryan landscape
The Indo-Aryan language family currently occupies a significant region of the Indian subcontinent, its member languages being spoken in the bulk of North India, as well as in Pakistan, Bangladesh, Nepal, Sri Lanka, and the Maldives. The historical depth of the textual record and the geographical breadth of the Indo-Aryan linguistic area, the diversity of its languages (226 in all), and its many...
Why Indo-Aryan languages adapt English alveolars as reʈroflexes: Acoustic evidence from Punjabi
In Indo-Aryan languages, English loanwords containing the alveolar /t/ are always adapted as retroflex /ʈ/ [1]. It is argued that English alveolars share the cues of release burst with the retroflexes in Indo-Aryan languages [2]. However, no quantitative acoustic evidence is provided by [2] as to what acoustic cues of English alveolars are important for the speakers of Indo-Aryan languages to a...